Mixed-lingual spoken word recognition by using VQ codebook sequences of variable length segments

Authors

  • Hiroaki Kojima
  • Kazuyo Tanaka
Abstract

We are investigating unsupervised phone modeling. This paper describes a method for deriving VQ codebook sequences of variable-length segments from spoken word samples, and reports evaluation results from applying the method to mixed-lingual speech recognition tasks that include non-native speakers. The VQ codebook is generated with a piecewise linear segmentation method that comprises segmentation, alignment, reduction and clustering processes. The derived codebook sequences are evaluated by speaker-independent recognition of a word set that mixes English and Japanese words. The speech samples are uttered by both native English and native Japanese speakers. Using a codebook of 128 codes, the average recognition rates on the 618-word mixed-lingual vocabulary are 89.7% for native English speakers and 79.4% for native Japanese speakers.
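
To make the idea of a "codebook sequence" concrete, the following is a minimal sketch, not the authors' procedure. It assumes each variable-length segment has already been reduced to a fixed-length feature vector (the abstract does not say how), and it substitutes plain k-means (Lloyd's algorithm) for the paper's clustering step. The names `train_codebook`, `encode` and `nearest_code` are hypothetical and chosen only for illustration.

```python
import math
import random


def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(vec, codebook[k]))


def train_codebook(vectors, n_codes, n_iter=20, seed=0):
    """Plain k-means codebook training; a stand-in for the paper's clustering step."""
    rng = random.Random(seed)
    # Initialise the codes with randomly chosen training vectors.
    codebook = [list(v) for v in rng.sample(vectors, n_codes)]
    for _ in range(n_iter):
        # Assignment step: group the vectors by their nearest code.
        groups = [[] for _ in codebook]
        for v in vectors:
            groups[nearest_code(v, codebook)].append(v)
        # Update step: move each code to the mean of its group.
        # A code with no members is left where it is.
        for k, members in enumerate(groups):
            if members:
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def encode(segments, codebook):
    """Map a sequence of segment vectors to a sequence of codebook indices."""
    return [nearest_code(s, codebook) for s in segments]


# Toy data (invented for illustration): two well-separated groups of 2-D vectors.
segment_vectors = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
toy_codebook = train_codebook(segment_vectors, n_codes=2)
print(encode(segment_vectors, toy_codebook))
```

In this toy setting a spoken word would be represented by the index sequence that `encode` returns. The abstract's actual system uses 128 codes and a segmentation, alignment and reduction pipeline that this sketch does not attempt to reproduce.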

Similar resources

Extracting phonological chunks based on piecewise linear segment lattices

The task of our research is to form phone-like models and a phoneme-like set from spoken word samples without using any transcriptions except for the lexical identification of each word in a vocabulary. This framework is derived from two motivations: 1) automatic design of optimal speech recognition units and structures of phone models, and 2) multi-lingual speech recognition based on languagein...

A Vector Quantization Approach to Speaker Recognition

© 1985 IEEE. ABSTRACT. In this study a vector quantization (VQ) codebook was used as an efficient means of characterizing the short-time spectral features of a speaker. A set of such codebooks were then used to ... system. In the other, Shore and Burton [12] used word-based VQ codebooks and reported good performance in speaker-trained isolated-word recognition experime...

YLAB@RU at Spoken Term Detection Task in NTCIR-9

Information retrieval based on speech recognition is an important technique for easy access to large amounts of multimedia content that includes speech. Spoken term detection (STD) techniques, which detect a given word or phrase in spoken documents, are being widely developed. This paper proposes a new STD method based on vector quantization (VQ). Spoken documents are repre...

Variable dimension VQ encoding and codebook design

A variable dimension vector quantizer (VDVQ) has codewords of unequal dimensions. Here, a trellis-based sequential optimal VDVQ encoding algorithm is proposed. Also, a VDVQ codebook design algorithm based on splitting a node with equal or reduced dimensions is proposed that does not require any codebook parameter to be prespecified unlike known schemes. The VDVQ system is shown to outperform a ...

VQ-faces - unsupervised face recognition from image sequences

In this paper we propose a new method for unsupervised face recognition – VQ-faces, which operates on a sequential stream of face images and is able to handle both frontal and side-view faces at the same time. The method consists of two parts: in the first part, the VQ-faces are calculated as prototype vectors of local areas in image-space, coding for different face-views (i.e. a “view codebook...

Publication date: 2003